Repair Time in Distributed Storage Systems

نویسندگان

  • Frédéric Giroire
  • Sandeep Kumar Gupta
  • Remigiusz Modrzejewski
  • Julian Monteiro
  • Stéphane Pérennes
چکیده

In this paper, we analyze a highly distributed backup storage system realized by means of nano datacenters (NaDa). NaDa have been recently proposed as a way to mitigate the growing energy, bandwidth and device costs of traditional data centers, following the popularity of cloud computing. These service provider-controlled peer-to-peer systems take advantage of resources already committed to always-on set top boxes, the fact they do not generate heat dissipation costs and their proximity to users. In this kind of systems redundancy is introduced to preserve the data in case of peer failures or departures. To ensure long-term fault tolerance, the storage system must have a self-repairing service that continuously reconstructs the fragments of redundancy that are lost. The speed of this reconstruction process is crucial for the data survival. This speed is mainly determined by how much bandwidth, which is a critical resource of such systems, is available. In the literature, the reconstruction times are modeled as independent (e.g., poissonian, deterministic, or more generally following any distribution). In practice, however, numerous reconstructions start at the same time (when the system detects that a peer has failed). Consequently, they are correlated to each other because concurrent reconstructions do compete for the same bandwidth. This correlation negatively impacts the efficiency of the bandwidth utilization and henceforth the repair time. We propose a new analytical framework that takes into account this correlation when estimating the repair time and the probability of data loss. Mainly, we introduce a queuing model in which reconstructions are served by peers at a rate that depends on the available bandwidth. We show that the load is unbalanced among peers (young peers inherently store less data than the old ones). This leads us to introduce a correcting factor on the repair rate of the system. The models and schemes proposed are validated by mathematical analysis, extensive set of simulations, and experimentation using the GRID5000 test-bed platform. This new model allows system designers to operate a more accurate choice of system parameters in function of their targeted data durability. ? The research leading to these results has received funding from the European Project FP7 EULER, ANR CEDRE, ANR AGAPE, Associated Team AlDyNet, project ECOS-Sud Chile and région PACA. 2 Authors Suppressed Due to Excessive Length

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Hybrid Regenerating Codes for Distributed Storage Systems

Distributed storage systems are mainly justified due to their ability to store data reliably over some unreliable nodes such that the system can have long term durability. Recently, regenerating codes are proposed to make a balance between the repair bandwidth and the storage capacity per node. This is achieved through using the notion of network coding approach. In this paper, a new variation ...

متن کامل

A Non-MDS Erasure Code Scheme for Storage Applications

This paper investigates the use of redundancy and self repairing against node failures indistributed storage systems using a novel non-MDS erasure code. In replication method, accessto one replication node is adequate to reconstruct a lost node, while in MDS erasure codedsystems which are optimal in terms of redundancy-reliability tradeoff, a single node failure isrepaired after recovering the ...

متن کامل

Evaluation of Energy Storage Technologies and Applications Pinpointing Renewable Energy Resources Intermittency Removal

Renewable energy sources (RES), especially wind power plants, have high priority of promotion in the energy policies worldwide. An increasing share of RES and distributed generation (DG), should, as has been assumed, provide improvement in reliability of electricity delivery to the customers. Paper presented here concentrates on electricity storage systems technologies and applications pinpoint...

متن کامل

Evaluation of Energy Storage Technologies and Applications Pinpointing Renewable Energy Resources Intermittency Removal

Renewable energy sources (RES), especially wind power plants, have high priority of promotion in the energy policies worldwide. An increasing share of RES and distributed generation (DG), should, as has been assumed, provide improvement in reliability of electricity delivery to the customers. Paper presented here concentrates on electricity storage systems technologies and applications pinpoint...

متن کامل

Capacity of Wireless Distributed Storage Systems with Broadcast Repair

In wireless distributed storage systems, storage nodes are connected by wireless channels, which are broadcast in nature. This paper exploits this unique feature to design an efficient repair mechanism, called broadcast repair, for wireless distributed storage systems in the presence of multiple-node failures. Due to the broadcast nature of wireless transmission, we advocate a new measure on re...

متن کامل

An Empirical Study of the Repair Performance of Novel Coding Schemes for Networked Distributed Storage Systems

Erasure coding techniques are getting integrated in networked distributed storage systems as a way to provide fault-tolerance at the cost of less storage overhead than traditional replication. Redundancy is maintained over time through repair mechanisms, which may entail large network resource overheads. In recent years, several novel codes tailor-made for distributed storage have been proposed...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013